fix(fiber): split DA Submit at Fibre's 128 MiB upload cap + duration log #3307
Conversation
The Fibre Submit path was opaque: failures showed up as DeadlineExceeded with no signal of how long the upload actually took, and successes were only logged at debug level inside the upstream library. During load-test debugging this turned into a guessing game — was the cluster slow, the deadline too tight, or something stuck mid-RPC? Add a single info-level (warn-on-failure) log line in fiberDAClient.Submit covering the Upload call: duration, flat blob bytes, blob count. The cost is one time.Since call, and it gives the operator concrete numbers — e.g. "17 blobs / 115 MiB / 1.5 s" — to reason about whether RPCTimeout, pending cap, or batch sizing is the right knob to turn next.
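A minimal sketch of what that timing wrapper could look like, assuming a `log/slog` logger and treating the upload as an injected function; the actual logger, field names, and Upload signature in `fiberDAClient` may differ:

```go
import (
	"context"
	"log/slog"
	"time"
)

// logTimedUpload is a hypothetical helper: it times a single upload and emits
// one info-level (warn-on-failure) record with duration, flat payload bytes,
// and blob count, as described above.
func logTimedUpload(
	ctx context.Context,
	logger *slog.Logger,
	upload func(context.Context, []byte) ([][]byte, error), // stand-in for the fiber Upload call
	flat []byte,
	blobCount int,
) (ids [][]byte, err error) {
	start := time.Now()
	ids, err = upload(ctx, flat)
	elapsed := time.Since(start)

	args := []any{"duration", elapsed, "bytes", len(flat), "blobs", blobCount}
	if err != nil {
		logger.Warn("fibre upload failed", append(args, "err", err)...)
		return nil, err
	}
	logger.Info("fibre upload succeeded", args...)
	return ids, nil
}
```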
Under sustained txsim load (~50 MiB/s) the DA submitter batched 10 block_data items into one Upload(), producing a flat payload of ~138 MiB. Fibre's per-upload cap is a hard ~128 MiB ("blob size exceeds maximum allowed size: data size 144366912 exceeds maximum 134217723"), so it rejected every batched upload. With MaxPendingHeadersAndData=10 that meant 170 consecutive failed submissions before the node halted itself with "Data exceeds DA blob size limit".
Wrap the Upload call in a chunker that groups input blobs into ≤120 MiB chunks (8 MiB of headroom under Fibre's cap for the per-blob length-prefix overhead added by flattenBlobs) and uploads each chunk separately. The chunker aggregates submitted counts and BlobIDs across chunks; on the first chunk failure it returns the error together with the partially-submitted count, so the submitter's retry/backoff logic sees a coherent state instead of all-or-nothing.
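A sketch of the grouping logic under stated assumptions: an 8-byte per-blob length prefix stands in for whatever flattenBlobs actually adds, and while the names mirror the PR (`chunkBlobsForFibre`, the 120 MiB budget constant), the body is illustrative rather than the exact implementation:

```go
// fibreUploadChunkBudget keeps each flattened upload at or below 120 MiB,
// leaving headroom under Fibre's ~128 MiB reject threshold.
const fibreUploadChunkBudget = 120 * 1024 * 1024

// chunkBlobsForFibre packs blobs into chunks whose estimated flattened size
// (blob bytes plus an assumed per-blob length prefix) stays within budget.
func chunkBlobsForFibre(blobs [][]byte, budget int) [][][]byte {
	const perBlobOverhead = 8 // assumed length-prefix size per blob
	var chunks [][][]byte
	var cur [][]byte
	curSize := 0
	for _, b := range blobs {
		sz := len(b) + perBlobOverhead
		// Close the current chunk when this blob would push it over budget.
		// A single blob larger than the budget still gets a chunk of its own.
		if len(cur) > 0 && curSize+sz > budget {
			chunks = append(chunks, cur)
			cur, curSize = nil, 0
		}
		cur = append(cur, b)
		curSize += sz
	}
	if len(cur) > 0 {
		chunks = append(chunks, cur)
	}
	return chunks
}
```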
Single oversized blobs (already validated against
DefaultMaxBlobSize earlier in Submit) still land alone and
fail server-side, but at least don't drag healthy peers
into the same rejected batch.
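A sketch of the Submit-side loop over those chunks, reusing the chunker above with hypothetical stand-ins for flattenBlobs and the fiber Upload call: each chunk is uploaded separately, counts and IDs are aggregated, and the first failure is returned alongside the partial results so the caller sees how far it got.

```go
import "context"

func submitInChunks(
	ctx context.Context,
	blobs [][]byte,
	flatten func([][]byte) []byte,                          // stand-in for flattenBlobs
	upload func(context.Context, []byte) ([][]byte, error), // stand-in for the fiber Upload call
) (submitted int, ids [][]byte, err error) {
	for _, chunk := range chunkBlobsForFibre(blobs, fibreUploadChunkBudget) {
		chunkIDs, uploadErr := upload(ctx, flatten(chunk))
		if uploadErr != nil {
			// Partial success: report how many blobs made it before the failure
			// so the submitter's retry/backoff can resume from a coherent state.
			return submitted, ids, uploadErr
		}
		submitted += len(chunk)
		ids = append(ids, chunkIDs...)
	}
	return submitted, ids, nil
}
```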
Companion to the submitter chunking fix. The submitter can
split a multi-blob batch into ≤120 MiB Fibre uploads, but
a *single* block_data item that exceeds 128 MiB still ends
up alone in its own chunk and fails server-side ("blob size
exceeds maximum allowed size"). Lower the per-block cap to
100 MiB so under high-throughput txsim a single block can't
grow past Fibre's hard limit, and update the comment to
explain the relationship between this cap and Fibre's
~128 MiB upload reject threshold.
```go
// Fibre Upload call. Fibre rejects payloads above ~128 MiB
// ("data size N exceeds maximum 134217723"); 120 MiB leaves slack for
// flattenBlobs's per-blob length prefixes and for any future overhead.
const fibreUploadChunkBudget = 120 * 1024 * 1024
```
Makes sense, https://github.com/evstack/ev-node/blob/main/block/internal/common/consts.go#L11-L12 would be 120 MiB before the prefixes.
Nice catch!
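For reference, the headroom arithmetic: Fibre's reject threshold is 134217723 bytes (5 bytes under 128 MiB = 134217728), and the 120 MiB budget is 125829120 bytes, so 134217723 − 125829120 = 8388603 bytes, just under 8 MiB, is left for the per-blob length prefixes.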
```go
// Set the per-block data cap below that so each block_data item
// fits in a single Fibre upload after the submitter splits a
// multi-blob batch into ≤120 MiB chunks.
block.SetMaxBlobSize(100 * 1024 * 1024)
```
👍🏾 can we update the ldflags here for consistency: https://github.com/celestiaorg/x402-risotto/blob/main/scripts/run-stack.sh#L61 ?
Issue
Under sustained txsim load the DA submitter batched up to 10 pending data items into a single `Upload()` call, producing a flat payload of ~138 MiB. Fibre's per-upload server-side cap is a hard ~128 MiB (`blob size exceeds maximum allowed size: data size 144366912 exceeds maximum 134217723`), so it rejected every batched upload. With `MaxPendingHeadersAndData=10` that meant 170 consecutive failed submissions before the daemon halted itself with `Data exceeds DA blob size limit`.

The Submit path also had no per-call observability — failures showed up as `DeadlineExceeded` or `oversized blob` after the fact, with no measurement of how long uploads actually took. During load-test debugging this turned into a guessing game over whether RPCTimeout, pending cap, or batch sizing was the right knob to turn next.

Solution

- `fiberDAClient.Submit`: wrap the `fiber.Upload` call in a chunker (`chunkBlobsForFibre`) that groups input blobs into ≤120 MiB chunks (8 MiB headroom under Fibre's 128 MiB cap for `flattenBlobs`'s per-blob length-prefix overhead) and uploads each chunk separately. Aggregates submitted counts and BlobIDs across chunks; on the first chunk failure, returns the error with the partially-submitted count so the submitter's retry/backoff sees a coherent state.
- Add a single info-level (warn-on-failure) log line covering the upload: duration, flat blob bytes, blob count. Cheap (one `time.Since`) and gives the operator concrete numbers — e.g. `17 blobs / 115 MiB / 1.5 s` — to reason about whether the upload pipeline or something downstream is the bottleneck.
- `evnode-fibre`: `block.SetMaxBlobSize` (120 → 100 MiB). Companion safety: after the chunker splits a multi-blob batch, a single oversized blob would still end up alone in its own chunk and fail server-side. Capping per-block data at 100 MiB ensures even a single block_data item fits in one Fibre upload.

Test plan

- No `data size N exceeds maximum 134217723` rejections under sustained load
- No `single item exceeds DA blob size limit` halts